Speaker adaptation using combined transformation and Bayesian methods

Authors

  • Vassilios Digalakis
  • Leonardo Neumeyer
Abstract

Adapting the parameters of a statistical speaker-independent continuous-speech recognizer to the speaker and the channel can significantly improve the recognition performance and robustness of the system. In continuous mixture-density hidden Markov models the number of component densities is typically very large, and it may not be feasible to acquire a sufficient amount of adaptation data for robust maximum-likelihood estimates. To solve this problem, we have recently proposed a constrained estimation technique for Gaussian mixture densities. To improve the behavior of our adaptation scheme for large amounts of adaptation data, we combine it here with Bayesian techniques. We evaluate our algorithms on the large-vocabulary Wall Street Journal corpus for nonnative speakers of American English. The recognition error rate is approximately halved with only a small amount of adaptation data, and it approaches the speaker-independent accuracy achieved for native speakers.

  • V. Digalakis: Electronic and Computer Engineering Dept., Technical University of Crete, Kounoupidiana, Chania, 73100 Greece. TEL +30-821-46566 x226, FAX +30-821-58708
  • L. Neumeyer: SRI International, 333 Ravenswood Ave., Menlo Park, CA 94025, USA. TEL +1-415-859-4522, FAX +1-415-859-5984
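The abstract describes two ingredients: a constrained transformation of the speaker-independent Gaussians, and a Bayesian step that takes over as adaptation data grows. The following is only a minimal sketch of that general idea, not the authors' algorithm. It assumes a diagonal affine transform of each mean (`scale`, `offset`) that has already been estimated elsewhere, and it uses the standard MAP mean update with a prior weight `tau`, taking the transformed mean as the prior. The function names, the hard assignment of frames to a single Gaussian, and the value of `tau` are all illustrative assumptions.

```python
from typing import List, Sequence


def transform_mean(mean: Sequence[float],
                   scale: Sequence[float],
                   offset: Sequence[float]) -> List[float]:
    """Apply a diagonal affine transform a*m + b to a speaker-independent mean.

    scale and offset are assumed to be given (hypothetical inputs); in practice
    they would be estimated from adaptation data and shared by many Gaussians.
    """
    return [a * m + b for a, m, b in zip(scale, mean, offset)]


def map_update_mean(prior_mean: Sequence[float],
                    frames: Sequence[Sequence[float]],
                    tau: float) -> List[float]:
    """Standard MAP mean update: (tau * prior + sum of frames) / (tau + n).

    frames are the adaptation vectors assigned to this Gaussian (hard
    assignment is assumed for simplicity). tau must be positive so the
    update stays defined when no frames are available.
    """
    n = len(frames)
    dim = len(prior_mean)
    sums = [sum(f[d] for f in frames) for d in range(dim)]
    return [(tau * prior_mean[d] + sums[d]) / (tau + n) for d in range(dim)]


def adapt_mean(si_mean: Sequence[float],
               scale: Sequence[float],
               offset: Sequence[float],
               frames: Sequence[Sequence[float]],
               tau: float = 10.0) -> List[float]:
    """Transform first, then use the transformed mean as the MAP prior."""
    prior = transform_mean(si_mean, scale, offset)
    return map_update_mean(prior, frames, tau)


if __name__ == "__main__":
    si_mean = [1.0, 2.0]
    scale, offset = [2.0, 1.0], [0.5, -1.0]
    # No frames for this Gaussian: the result is just the transformed mean.
    print(adapt_mean(si_mean, scale, offset, frames=[]))
    # Ten frames: the estimate moves from the prior toward the sample mean.
    print(adapt_mean(si_mean, scale, offset, frames=[[4.5, 3.0]] * 10))
```

With no frames assigned to a Gaussian, the sketch falls back to the transformed mean; as the frame count grows relative to `tau`, the estimate is dominated by the speaker's own data, which is the behavior the abstract attributes to the Bayesian step.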

Similar resources

Robust Speaker Clustering in Eigenspace

In this paper we propose a speaker clustering scheme working in 'Eigenspace'. Speaker models are transformed to a low-dimensional subspace using 'Eigenvoices'. For the speaker clustering procedure, simple distance measures, e.g., Euclidean distance, can be applied. Moreover, clustering can be accomplished with base models (for Eigenvoice projection) like Gaussian Mixture Models as well as conventi...

ACOUSTIC MODEL ADAPTATION FOR AUTOMATIC SPEECH RECOGNITION AND ANIMAL VOCALIZATION CLASSIFICATION

Jidong Tao, B.Eng., M.S., Marquette University, 2009. Automatic speech recognition (ASR) converts human speech to readable text. Acoustic model adaptation, also called speaker adaptation, is one of the most promising techniques in ASR for improving recognition accuracy. Adaptation works by tuning a g...

Improved Bayesian learning of hidden Markov models for speaker adaptation

We propose an improved maximum a posteriori (MAP) learning algorithm of continuous-density hidden Markov model (CDHMM) parameters for speaker adaptation. The algorithm is developed by sequentially combining three adaptation approaches. First, the clusters of speaker-independent HMM parameters are locally transformed through a group of transformation functions. Then, the transformed HMM paramete...

Online Bayesian tree-structured transformation of HMMs with optimal model selection for speaker adaptation

This paper presents a new recursive Bayesian learning approach for transformation parameter estimation in speaker adaptation. Our goal is to incrementally transform or adapt a set of hidden Markov model (HMM) parameters for a new speaker and gain large performance improvement from a small amount of adaptation data. By constructing a clustering tree of HMM Gaussian mixture components, the linear...

On-line Bayesian speaker adaptation using tree-structured transformation and robust priors

This paper presents new results using our recently proposed on-line Bayesian learning approach for affine transformation parameter estimation in speaker adaptation. The on-line Bayesian learning technique allows updating parameter estimates after each utterance, and it can accommodate flexible forms of transformation functions as well as prior probability density functions. We show through ex...

Journal:
  • IEEE Trans. Speech and Audio Processing

Volume 4

Publication date: 1995